Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow provider-specific data types #529

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

merkys
Copy link
Member

@merkys merkys commented Jun 14, 2024

This PR attempts to generalize #436 by allowing to define provider-specific data types. The following conditions for such data types apply:

  • Their values have to be expressed as string literals both in responses and in filters;
  • They have to be named using provider-specific prefixes;
  • Properties can only be compared with values of the same type (as per specification up til now).

The latter condition might be too strict for some uses, but since casting is unclear at the moment, let us leave such extensions for future.

Merging this PR will allow to reassign #392 and #436 (SMILES) to _cheminfo_ namespace where it can become a property with special set of rules for comparison.

@merkys merkys requested review from blokhin, rartino, vaitkus and ml-evs June 14, 2024 14:11
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated
@@ -215,7 +215,7 @@ Hence, entry properties are described in this proposal using
context-independent types that are assumed to have some form of
representation in all contexts. They are as follows:

- Basic types: **string**, **integer**, **float**, **boolean**, **timestamp**.
- Basic types: **string**, **integer**, **float**, **boolean**, **timestamp**, database-provider-specific or definition-provider-specific data type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same database-provider-specific or definition-provider-specific data type phrase is repeated multiple times throughout the specification. I like that we are being really specific, but maybe we could defined a term somewhere (e.g. custom data type) and then refer to it troughout the text?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, I will check the term definition section to see how well it would fit there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On another point: I think it seems odd to list provider-specific data types under "Basic types". To the extent that we do the separation into Basic vs. list/dictionary on the basis of "contains one thing" vs "many things", it also isn't clear to me why the defined types would have to be seen as the former. I suggest we move the segment about database-specific datatypes below the definitions of the datatypes that are explicitly defined by the standard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea is to allow only those provider-specific data types that can be expressed as strings (e.g., symmetry operators, complex numbers etc). Timestamp type which OPTIMADE already has could fall under the same category of strings with internal semantics. That is why I lumped provider-specific data types together with them. But I agree that they could be split off to a separate segment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same database-provider-specific or definition-provider-specific data type phrase is repeated multiple times throughout the specification. I like that we are being really specific, but maybe we could defined a term somewhere (e.g. custom data type) and then refer to it troughout the text?

What about calling them "namespace-specific data types"? The specification already has a section "Namespace Prefixes" which is later split into database-specific and definition provider-specific ones, thus I think "namespace" is a right term to use.

merkys and others added 2 commits June 19, 2024 15:45
optimade.rst Outdated Show resolved Hide resolved
Co-authored-by: Antanas Vaitkus <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants